Results 1 - 10 of 10
1.
Indian J Med Res ; 151(1): 93-103, 2020 01.
Article in English | MEDLINE | ID: mdl-32134020

ABSTRACT

Background & objectives: For bacterial community analysis, 16S rRNA sequences are subjected to taxonomic classification through comparison with one of the three commonly used databases [Greengenes, SILVA and Ribosomal Database Project (RDP)]. It was hypothesized that a unified database containing fully annotated, non-redundant sequences from all the three databases, might provide better taxonomic classification during analysis of 16S rRNA sequence data. Hence, a unified 16S rRNA database was constructed and its performance was assessed by using it with four different taxonomic assignment methods, and for data from various hypervariable regions (HVRs) of 16S rRNA gene. Methods: We constructed a unified 16S rRNA database (16S-UDb) by merging non-ambiguous, fully annotated, full-length 16S rRNA sequences from the three databases and compared its performance in taxonomy assignment with that of three original databases. This was done using four different taxonomy assignment methods [mothur Naïve Bayesian Classifier (mothur-nbc), RDP Naïve Bayesian Classifier (rdp-nbc), UCLUST, SortMeRNA] and data from 13 regions of 16S rRNA [seven hypervariable regions (HVR) (V2-V8) and six pairs of adjacent HVRs]. Results: Our unified 16S rRNA database contained 13,078 full-length, fully annotated 16S rRNA sequences. It could assign genus and species to larger proportions (90.05 and 46.82%, respectively, when used with mothur-nbc classifier and the V2+V3 region) of sequences in the test database than the three original 16S rRNA databases (70.88-87.20% and 10.23-24.28%, respectively, with the same classifier and region). Interpretation & conclusions: Our results indicate that for analysis of bacterial mixtures, sequencing of V2-V3 region of 16S rRNA followed by analysis of the data using the mothur-nbc classifier and our 16S-UDb database may be preferred.
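
The merging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the seven-rank definition of "fully annotated" and the function names are assumptions made for illustration, and the full-length filter is left out.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (sequence, lineage as a tuple of rank names).
Record = Tuple[str, Tuple[str, ...]]

VALID_BASES = set("ACGT")
RANKS_REQUIRED = 7  # assumption: kingdom through species counts as "fully annotated"


def is_unambiguous(seq: str) -> bool:
    """True when the sequence holds only A, C, G, T (no N or other IUPAC codes)."""
    return bool(seq) and set(seq) <= VALID_BASES


def is_fully_annotated(lineage: Tuple[str, ...]) -> bool:
    """True when every required rank is present and non-empty."""
    return (len(lineage) >= RANKS_REQUIRED
            and all(rank.strip() for rank in lineage[:RANKS_REQUIRED]))


def merge_databases(*databases: Iterable[Record]) -> Dict[str, Tuple[str, ...]]:
    """Merge records, keeping the first lineage seen for each distinct sequence."""
    merged: Dict[str, Tuple[str, ...]] = {}
    for db in databases:
        for seq, lineage in db:
            key = seq.upper().replace("U", "T")  # treat RNA and DNA spellings alike
            if is_unambiguous(key) and is_fully_annotated(lineage):
                merged.setdefault(key, lineage)
    return merged
```

Calling `merge_databases(greengenes, silva, rdp)` on three iterables of such records would return one entry per distinct sequence. Which lineage to keep when the sources disagree is a design decision that this sketch resolves arbitrarily (first one wins).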


Subject(s)
Bacteria/genetics, Classification, Gastrointestinal Microbiome/genetics, 16S Ribosomal RNA/genetics, Bacteria/classification, Humans, Metagenomics/classification, Phylogeny, DNA Sequence Analysis
2.
IEEE Trans Nanobioscience ; 18(3): 273-282, 2019 07.
Article in English | MEDLINE | ID: mdl-31021803

ABSTRACT

High-throughput sequencing techniques have accelerated functional metagenomics studies through the generation of large volumes of omics data. The integration of these data using computational approaches is potentially useful for predicting metagenomic functions. Machine learning (ML) models can be trained using microbial features, which are then used to classify microbial data into different functional classes. For example, ML analyses of human microbiome data have been linked to the prediction of important biological states. For analysing omics data, integrating abundance counts of taxonomic features with their biological relationships is important. These relationships can potentially be uncovered from the phylogenetic tree of microbial taxa. In this paper, we propose a novel integrative framework, Phy-PMRFI. This framework is driven by the phylogeny-based modeling of omics data to predict metagenomic functions using important features selected by a random forest importance (RFI) strategy. The proposed framework integrates the underlying phylogenetic tree information with abundance measures of microbial species (features) by creating a novel phylogeny and abundance aware matrix structure (PAAM). Phy-PMRFI progresses by ranking the microbial features using an RFI measure. This ranking is then used as input for microbiome classification. The resultant feature set enhances the performance of state-of-the-art methods such as support vector machines. Our proposed integrative framework also outperforms the state-of-the-art pipelines of phylogenetic isometric log-ratio transform (PhILR) and MetaPhyl. A prediction accuracy of 90% is obtained with Phy-PMRFI on the human throat microbiome, compared with 53% for PhILR and 71% for MetaPhyl.
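
One simple way to combine a phylogenetic tree with abundance counts is to give every internal node the summed abundance of the leaves below it. The sketch below illustrates only that general idea; it is not the PAAM construction, whose details the abstract does not give, and the tree encoding (a child-to-parent map) and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Optional


def propagate_abundance(parent: Dict[str, Optional[str]],
                        leaf_abundance: Dict[str, float]) -> Dict[str, float]:
    """Add each leaf's abundance to the leaf itself and to all of its ancestors."""
    totals: Dict[str, float] = defaultdict(float)
    for leaf, count in leaf_abundance.items():
        node: Optional[str] = leaf
        while node is not None:
            totals[node] += count
            node = parent.get(node)  # the root maps to None, which ends the walk
    return dict(totals)
```

The resulting per-node values could then serve as features for any classifier, with clade-level features carrying the tree structure that raw species counts lack.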


Subject(s)
Decision Trees, Metagenome/genetics, Metagenomics, Support Vector Machine, Algorithms, Genetic Databases, High-Throughput Nucleotide Sequencing, Humans, Metagenomics/classification, Metagenomics/methods, Microbiota/genetics, Pharynx/microbiology, Phylogeny
3.
Nat Commun ; 9(1): 4894, 2018 11 20.
Article in English | MEDLINE | ID: mdl-30459421

ABSTRACT

Citrus is a globally important, perennial fruit crop whose rhizosphere microbiome is thought to play an important role in promoting citrus growth and health. Here, we report a comprehensive analysis of the structural and functional composition of the citrus rhizosphere microbiome. We use both amplicon and deep shotgun metagenomic sequencing of bulk soil and rhizosphere samples collected across distinct biogeographical regions from six continents. Predominant taxa include Proteobacteria, Actinobacteria, Acidobacteria and Bacteroidetes. The core citrus rhizosphere microbiome comprises Pseudomonas, Agrobacterium, Cupriavidus, Bradyrhizobium, Rhizobium, Mesorhizobium, Burkholderia, Cellvibrio, Sphingomonas, Variovorax and Paraburkholderia, some of which are potential plant beneficial microbes. We also identify over-represented microbial functional traits mediating plant-microbe and microbe-microbe interactions, nutrition acquisition and plant growth promotion in citrus rhizosphere. The results provide valuable information to guide microbial isolation and culturing and, potentially, to harness the power of the microbiome to improve plant production and health.


Subject(s)
Citrus/microbiology, Microbiota/genetics, Plant Roots/microbiology, Rhizosphere, Soil Microbiology, Bacteria/classification, Bacteria/genetics, Ribosomal Spacer DNA/genetics, Fungi/classification, Fungi/genetics, Metagenome/genetics, Metagenomics/classification, Metagenomics/methods, Phylogeny, 16S Ribosomal RNA/genetics
4.
Univ. sci ; 22(1): 87-96, Jan.-Apr. 2017. illus, tab
Article in English | LILACS, COLNAL | ID: biblio-904707

ABSTRACT

Soil is a large source of microorganisms with the potential to produce bioactive compounds. Since most of them cannot be cultured, metagenomics has become a useful tool to evaluate this potential. The aim of this study was to screen biosynthetic polyketide genes (PKS) present in a metagenomic library constructed from a soil sample isolated from the Brazilian Atlantic Forest. The library comprises 5000 clones with DNA inserts between 40 and 50 Kb. The characterization of the biosynthetic gene clusters of these molecules is a promising alternative to elucidate the biotechnological potential of bioactive compounds in microbial communities. The PKS genes were screened using degenerate primers. The positive clones for PKS systems were isolated, and their nucleotide sequences were analysed with bioinformatics tools. The screening yielded two positive clones for PKS II genes. Furthermore, variations in the sequences of the PKS II genes from the metagenomic library were observed when compared with sequences in ketosynthase databases. With these findings we gain insight into the possible relation between new biosynthetic genes and the production of new secondary metabolites.


Subject(s)
Metagenomics/classification, Polyketides/analysis
6.
Genome Res ; 26(12): 1721-1729, 2016 12.
Article in English | MEDLINE | ID: mdl-27852649

ABSTRACT

Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space.
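
The indexing idea named in this abstract, a Burrows-Wheeler transform queried through an FM-index, can be shown on a toy string. The sketch below is a textbook illustration, not Centrifuge's implementation: it builds the transform from sorted rotations, which only suits short inputs, and the function names are assumptions.

```python
from typing import Dict, List, Tuple


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (demo-sized inputs only)."""
    text += "$"  # sentinel: assumed absent from the input, sorts before A/C/G/T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm(bw: str) -> Tuple[Dict[str, int], Dict[str, List[int]]]:
    """Return `first` (number of symbols smaller than c) and `occ` (prefix counts of c)."""
    alphabet = sorted(set(bw))
    first: Dict[str, int] = {}
    total = 0
    for c in alphabet:
        first[c] = total
        total += bw.count(c)
    occ: Dict[str, List[int]] = {c: [0] for c in alphabet}
    for ch in bw:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return first, occ


def count_matches(pattern: str, first: Dict[str, int],
                  occ: Dict[str, List[int]]) -> int:
    """Count exact occurrences of `pattern` by backward search over the index."""
    lo, hi = 0, len(next(iter(occ.values()))) - 1  # half-open row range [lo, hi)
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

For the text `GATTACA`, `bwt` returns `ACTGA$TA`, and backward search finds the pattern `TA` once. Each query step touches only the two tables, so lookup time depends on the pattern length rather than on the size of the indexed text.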


Subject(s)
Archaea/classification, Bacteria/classification, Metagenomics/classification, Algorithms, Archaea/genetics, Bacteria/genetics, Computational Biology/methods, Nucleic Acid Databases, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis
7.
Nat Commun ; 7: 11257, 2016 Apr 13.
Article in English | MEDLINE | ID: mdl-27071849

ABSTRACT

Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows-Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.
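
Matching at the protein level, as this abstract describes, means translating each read in all six reading frames and looking for long matches against reference proteins. The sketch below illustrates that idea under stated assumptions: it uses a naive substring search instead of a Burrows-Wheeler index, handles exact matches only, and its function names and `min_len` threshold are hypothetical rather than taken from Kaiju.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (for example with N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> List[str]:
    dna = dna.upper()
    return [translate(strand[offset:])
            for strand in (dna, reverse_complement(dna))
            for offset in range(3)]


def longest_exact_match(query: str, reference: str) -> int:
    """Length of the longest substring of `query` that occurs in `reference`."""
    best = 0
    for i in range(len(query)):
        for j in range(i + best + 1, len(query) + 1):  # only try to beat `best`
            if query[i:j] in reference:
                best = j - i
            else:
                break
    return best


def classify_read_by_protein(read: str, proteins: Dict[str, str],
                             min_len: int = 5) -> Tuple[Optional[str], int]:
    """Return (protein name, match length) for the longest match, or (None, 0)."""
    best_name, best_len = None, 0
    for frame in six_frames(read):
        for fragment in frame.split("*"):  # stop codons split a frame into fragments
            for name, protein in proteins.items():
                length = longest_exact_match(fragment, protein)
                if length > best_len:
                    best_name, best_len = name, length
    return (best_name, best_len) if best_len >= min_len else (None, 0)
```

Because synonymous codons translate to the same amino acid, a read can differ from the reference at the nucleotide level and still match at the protein level, which is the sensitivity argument the abstract makes.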


Subject(s)
Algorithms, Classification, Metagenomics/classification, Amino Acid Sequence, Animals, Humans, Metagenome, Proteins/chemistry
8.
Bioinformatics ; 31(22): 3584-92, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26209798

ABSTRACT

MOTIVATION: Metagenomics is a powerful approach to study genetic content of environmental samples, which has been strongly promoted by next-generation sequencing technologies. To cope with massive data involved in modern metagenomic projects, recent tools rely on the analysis of k-mers shared between the read to be classified and sampled reference genomes. RESULTS: Within this general framework, we show that spaced seeds provide a significant improvement of classification accuracy, as opposed to traditional contiguous k-mers. We support this thesis through a series of different computational experiments, including simulations of large-scale metagenomic projects. AVAILABILITY AND IMPLEMENTATION, SUPPLEMENTARY INFORMATION: Scripts and programs used in this study, as well as supplementary material, are available from http://github.com/gregorykucherov/spaced-seeds-for-metagenomics. CONTACT: gregory.kucherov@univ-mlv.fr.
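
A spaced seed samples only some positions of each window, so a mismatch at an ignored position does not destroy the seed. The sketch below shows the general technique, not the authors' scripts: the mask notation (`1` = position used, `0` = position skipped) is the usual convention, while the function names and the shared-seed scoring rule are assumptions.

```python
from typing import Dict, Optional, Set


def spaced_kmers(seq: str, mask: str) -> Set[str]:
    """Collect the seeds of `seq` under `mask`; a mask of all 1s gives contiguous k-mers."""
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    span = len(mask)
    return {"".join(seq[start + i] for i in keep)
            for start in range(len(seq) - span + 1)}


def classify_by_seeds(read: str, references: Dict[str, str],
                      mask: str) -> Optional[str]:
    """Return the reference sharing the most seeds with the read, or None if none match."""
    read_seeds = spaced_kmers(read, mask)
    best_name, best_shared = None, 0
    for name, genome in references.items():
        shared = len(read_seeds & spaced_kmers(genome, mask))
        if shared > best_shared:
            best_name, best_shared = name, shared
    return best_name
```

With the mask `11011`, the windows `CGTTG` and `CGATG` both yield the seed `CGTG`, although they differ at the skipped middle position. A contiguous 5-mer would treat the two windows as unrelated.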


Subject(s)
Algorithms, Metagenomics/classification, Bacillus/genetics, Genetic Databases, Bacterial Genome, Mycobacterium/genetics, Probability, Sequence Alignment, Nonparametric Statistics
9.
Bioinformatics ; 30(1): 17-23, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-23645816

ABSTRACT

MOTIVATION: TANGO is one of the most accurate tools for the taxonomic assignment of sequence reads. However, because of the differences in the taxonomy structures, performing a taxonomic assignment on different reference taxonomies will produce divergent results. RESULTS: We have improved the TANGO pipeline to be able to perform the taxonomic assignment of a metagenomic sample using alternative reference taxonomies coming from different sources. We highlight the novel pre-processing step necessary to accomplish this task, and describe the improvements in the assignment process. We present the new TANGO pipeline in detail and, finally, we show its performance on four real metagenomic datasets and also on synthetic datasets. AVAILABILITY: The new version of TANGO, including implementation improvements and novel developments to perform the assignment on different reference taxonomies, is freely available at http://sourceforge.net/projects/taxoassignment/.


Subject(s)
Metagenomics/methods, Software, Algorithms, Metagenomics/classification
10.
Genomics ; 96(1): 27-38, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20338234

ABSTRACT

The activities of prokaryotes are pivotal in shaping the environment, and at the same time are greatly influenced by the environment. By using the genomic data and environmental descriptions of the complete prokaryotic genomes in NCBI's Microbial Genome Project Database and applying statistical methods, we have systematically identified those gene groups whose presence/frequency patterns differ between organisms living under different environmental conditions. Here environmental conditions are characterized in four dimensions (salinity, oxygen requirement, habitat and temperature) and are based on the controlled vocabularies that NCBI's Microbial Genome Project Database uses to specify the organism information; gene groups are determined as Clusters of Orthologous Groups (COG) and KEGG Orthology (KO) groups. These identified COG and KO groups are considered as potentially correlated with certain environmental conditions, and are then mapped to the COG general categories and KEGG pathways to determine which parts of the functional machinery of prokaryotic cells are correlated with the environments. The observations derived from the analysis of the COG and KO groups that are potentially correlated with the oxygen requirement and habitat conditions are in general consistent with existing studies on properties of organisms living in different conditions of these two environmental factors. To further assess the identified correlation relationships, we have also examined whether the environmental conditions are predictable based on the gene distributions in the selected COG and KO groups. The misclassification rates of the prediction experiments are much smaller than those rendered by random guessing, indicating the existence of correlation relationships between organisms' environmental conditions and gene distributions in certain functional groups. However, the rather moderate misclassification rates (the 25th and 75th percentiles of the misclassification rates of all prediction experiments are 16.79% and 24.06%, respectively) also indicate that the correlation relationships between environmental conditions and gene distributions in certain functional groups are not strong enough for one to decisively define the other.
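
The evaluation described in this abstract compares misclassification rates with a random-guessing baseline and summarizes them by their 25th and 75th percentiles. The sketch below shows one way such figures could be computed. The uniform-guess baseline and the inclusive quantile method are assumptions, since the abstract does not say how the authors defined either, and the function names are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def misclassification_rate(true_labels: Sequence[str],
                           predicted: Sequence[str]) -> float:
    """Fraction of organisms whose predicted condition differs from the true one."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted))
    return wrong / len(true_labels)


def random_guess_rate(true_labels: Sequence[str]) -> float:
    """Expected error when guessing uniformly among the observed classes (assumed baseline)."""
    return 1 - 1 / len(set(true_labels))


def quartile_bounds(rates: List[float]) -> Tuple[float, float]:
    """Return the 25th and 75th percentiles of a list of misclassification rates."""
    q1, _, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    return q1, q3
```

A predictor is informative in the abstract's sense when its `misclassification_rate` falls well below `random_guess_rate` for the same labels. The `quartile_bounds` call condenses the rates of many prediction experiments into the two percentiles reported.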


Subject(s)
Environment, Metagenomics/methods, Prokaryotic Cells/physiology, Systems Biology/methods, Statistical Data Interpretation, Nucleic Acid Databases, Metagenomics/classification, Theoretical Models
...